EQUALISATION OF THE OUTPUT IN A STEREO WIDENING
NETWORK

The present invention relates to a method for converting stereo format
signals to become suitable for playback using headphones. The
invention also relates to a signal processing device for carrying out said
method. The invention further relates to a computer program
comprising machine executable steps for carrying out said method.
Finally, the invention relates to a mobile appliance with audio
capabilities.

Already for several decades the prevailing format for making music and
other audio recordings and public broadcasts has been the well-known
two-channel stereo format. The two-channel stereo format consists of
two independent tracks or channels, the left (L) and the right (R)
channel, which are intended for playback using separate loudspeaker
units. Said channels are mixed and/or recorded and/or otherwise
prepared to provide a desired spatial impression to a listener who is
positioned centrally in front of two loudspeaker units spanning ideally
60 degrees with respect to the listener. When a two-channel stereo
recording is listened to through the left and right loudspeakers arranged
in the above described manner, the listener experiences a spatial
impression resembling the original sound scenery. In this spatial
impression the listener is able to observe the direction of the different
sound sources, and the listener also acquires a sensation of the
distance of the different sound sources. In other words, when listening
to a two-channel stereo recording, the sound sources seem to be
located somewhere in front of the listener and inside the area located
somewhere between the left and the right loudspeaker units.

Other audio recording formats are also known, which, instead of only
two loudspeaker units, rely on the use of more than two loudspeaker
units for the playback. For example, in a four-channel stereo system
two loudspeaker units are positioned in front of the listener, one to the
left and one to the right, and two other loudspeaker units are positioned
behind the listener, to the rear left and to the rear right, respectively.
Further, a separate fifth channel/loudspeaker may be provided for the
low frequency sounds.

Such multichannel arrangements are nowadays commonly used, e.g.,
in computer games, in movie theatres or even in home entertainment
systems. This makes it possible to create a more detailed spatial
impression of the sound scenery, where the sounds can be heard coming
not only from the area located in front of the listener, but also from
behind, or directly from the side of the listener. Recordings for these
multichannel systems can be prepared to have independent tracks for
each separate channel, or the information of the "extra" channels in
addition to a normal two-channel stereo format can also be coded into
the left and right channel signals in a two-channel stereo format
recording. In the latter case a special decoder is required during the
playback to extract the signals, for example, for the rear left and rear
right channels. Digital Video Disc (DVD) products, for example, support
the aforementioned multichannel sound arrangements.

Further, some special methods are known in order to prepare
recordings, which are specially intended to be heard over headphones.
These include, for example, binaural signals that are made by
recording signals corresponding to the pressure signals that would be
captured by the eardrums of a human listener in a real listening
situation. Such recordings can be made for example by using a
dummy head, which is an artificial head equipped with two
microphones replacing the two human ears. When a high-quality
binaural recording is heard over headphones, the listener experiences
the original, detailed three-dimensional sound image of the recording
situation. Binaural signals can also be synthesized without the need for
making a real-life recording.

The present invention is mainly related to such general two-channel
stereo recordings, broadcasts or similar audio material, which have
been mixed and/or otherwise prepared to be played back over two
loudspeaker units, which said units are intended to be positioned in the
previously described manner with respect to the listener. Hereinbelow,
the use of the short term "stereo" refers to the aforementioned kind of
two-channel stereo format. Listening to audio material in such stereo
format played back over two loudspeakers is hereinbelow shortly referred
to as "natural listening".

When a stereo recording is played back over loudspeakers in a natural
listening situation, the sound emitted from the left loudspeaker is heard
not only by the listener's left ear but also by the right ear, and
correspondingly the sound emitted from the right loudspeaker is heard
both by the right and left ear. This condition is of primary importance for
the generation of a hearing impression with a correct spatial feeling. In
other words, this is important in order to generate a hearing impression
in which the sounds seem to originate from a space or stage outside
the listener's head. When listening to a stereo recording over
headphones, the left channel is heard in the left ear only, and the right
channel is heard in the right ear only. This causes the hearing
impression to be both unnatural and tiresome to listen to, and the
sound scenery or stage is contained entirely inside the listener's head:
the sound is not externalised as intended.

There are reasons to support the opinion that when a recording in
normal stereo format is played back over headphones directly without
any spatial conversion, the above described unnatural spatial
impression may cause listening fatigue. Therefore, in order to
compensate for the unnatural listening conditions experienced when
using headphones, so-called spatial enhancers, or stereo widening
networks, are known from the related art.

The basic idea behind most spatial enhancers or stereo widening
systems is that the sound heard by the listener over headphones
should be very similar to the sound the listener would have heard, if the
music had been played back over two widely spaced loudspeakers. In
other words, the stereo signals played back through the headphones
are processed in order to create in the listener's ears an impression of
the sound coming from a pair of "virtual loudspeakers", and thus further
resembling the listening to the real original sound sources. Methods
belonging to this category are referred to later in this text as "virtual
loudspeaker methods".

An earlier published patent application EP 1194007 by the Applicant
discloses a stereo widening network based on the aforementioned
virtual loudspeaker-type approach. Said stereo widening network is
thus capable of externalising the sounds so that the listener
experiences the sound scenery or stage to be located outside his/her
head in a manner similar to a natural listening situation.

Figure 1 illustrates schematically an example of a stereo widening
network relying on the virtual loudspeaker approach. In order to
conceptually understand the operation of the stereo widening network
shown in Fig. 1, one can consider the following. Input signals L and R
represent stereo format signals that are in a natural listening situation
fed directly to a pair of loudspeakers. Sound emitted by the left
loudspeaker is then heard at both ears, and, similarly, sound emitted
by the right loudspeaker is also heard at both ears. Consequently, in a
natural listening situation there are four acoustical paths from the two
loudspeakers to the two ears, i.e. two so-called direct paths and two
so-called cross-talk paths. These acoustical paths have their
corresponding signal paths in a stereo widening network.

When the loudspeakers are positioned symmetrically with respect to
the listener, the direct path from the left speaker to the left ear is the
same as the direct path from the right speaker to the right ear, and,
similarly, the cross-talk from the left speaker to the right ear is the
same as the cross-talk from the right speaker to the left ear. In Fig. 1
we denote the identical direct paths by subscript 'd' and the identical
cross-talk paths by subscript 'x'. The direct path and the cross-talk path
each has a discrete-time transfer function, H_d(z) and H_x(z) associated
with it, respectively. The cross-talk path transfer functions H_x(z)
include a delay term, which simulates the path length difference between
the direct and cross-talk paths. In other words, in a natural listening
situation, for example, the sound from the left speaker arrives at the
right ear (cross-talk path) slightly later than at the left ear (direct path).
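
The four signal paths described above can be illustrated with a minimal
sketch. It assumes, purely for illustration, that the direct path is a
unit gain (H_d(z) = 1) and that the cross-talk path is a plain gain and
delay (H_x(z) = g z^-D). The function name, the gain of 0.5 and the delay
of 26 samples are hypothetical and are not taken from the text; a
practical H_x(z) would also be frequency dependent.

```python
def widen(left, right, crosstalk_gain=0.5, crosstalk_delay=26):
    """Simplified virtual-loudspeaker network (illustrative assumption only).

    Direct paths:     H_d(z) = 1
    Cross-talk paths: H_x(z) = crosstalk_gain * z^-crosstalk_delay
    Both input lists are expected to have the same length.
    """
    out_left = []
    out_right = []
    for i in range(len(left)):
        j = i - crosstalk_delay  # index of the delayed cross-talk sample
        from_right = crosstalk_gain * right[j] if j >= 0 else 0.0
        from_left = crosstalk_gain * left[j] if j >= 0 else 0.0
        # Each ear receives its own channel directly plus the delayed,
        # attenuated signal of the opposite channel.
        out_left.append(left[i] + from_right)
        out_right.append(right[i] + from_left)
    return out_left, out_right
```

With an impulse in the left channel only, the left output contains the
direct sound at once, while the right output contains the attenuated
cross-talk copy after the chosen delay.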

It can be readily understood that the aforementioned delay generated
by the stereo widening network between the direct and cross-talk paths
plays a very important role in creating a correct spatial hearing
impression in headphone listening. As is familiar to a person skilled in
the art, the difference between the time delays in the direct path and
the cross-talk path corresponds to the interaural time difference (ITD),
and the difference between the gains in the direct path and the
cross-talk path corresponds to the interaural level difference (ILD). The
ILD is dependent on the frequency whereas the ITD is not.

Unfortunately, the human auditory system is extremely sensitive to any
modifications made to a high-quality music recording. Artifacts of any
kind introduced in spatial processing are readily picked up, even by
rather inexperienced listeners. Consequently, it is advantageous to be
able to ensure that a spatial enhancer or stereo widening network does
not do any harm to the quality of the original recording.

One of the most prominent elements of a stereo recording is the
monophonic component. As is well known to a person skilled in the art,
the monophonic component is the part of the signal which is common
to both the L and R channels, and which is therefore in a natural
listening situation heard at the centre of the sound stage. The lead
vocals on a pop recording, for example, are usually positioned at the
centre of the sound stage.

When stereo sound signals L, R including a prominent monophonic
component are processed using a prior art type stereo widening network
illustrated in Fig. 1, this causes significant attenuation of the
monophonic signals at certain frequencies or frequency bands. This is
because when a delay is added into the cross-talk path signal by H_x(z),
in certain situations this generates a signal that has a substantially
similar waveform to the signal present in the direct path but with
substantially opposite phase. When the direct path and cross-talk path
signals corresponding to the monophonic component are summed up
together, the aforementioned phase difference between these signals
causes attenuation of the monophonic component at certain
frequencies or frequency bands. Later in this text this effect is referred
to shortly as destructive interference.
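
The destructive interference can be made concrete with a small
calculation. The sketch below assumes a deliberately simplified model in
which H_d = 1 and H_x is a pure delay of tau seconds with gain g, so that
a monophonic input is scaled by |1 + g exp(-j 2 pi f tau)| at each ear.
The function names are hypothetical, and a real cross-talk path is
frequency dependent, so the dip in practice is finite rather than a
complete null.

```python
import cmath
import math


def mono_gain(freq_hz, delay_s, crosstalk_gain=1.0):
    """Magnitude of H_d + H_x for a monophonic input in the simplified model."""
    return abs(1.0 + crosstalk_gain * cmath.exp(-2j * math.pi * freq_hz * delay_s))


def first_notch_hz(delay_s):
    """Lowest frequency at which the two paths are in opposite phase.

    Opposite phase requires 2*pi*f*tau = pi, i.e. f = 1 / (2*tau).
    """
    return 1.0 / (2.0 * delay_s)
```

Under this model a path delay difference of about 0.83 ms places the
first dip at 600 Hz, and a delay of 0.25 ms places it at 2 kHz.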

The aforementioned unwanted modification of the monophonic signal
component as a result of the spatial processing is unacceptable to
many listeners, and this motivates the design of a signal processing
method that can alleviate this problem. According to the Applicant's
point of view, this problem has not been solved satisfactorily in prior art
designs.

US patent 6111958 presents audio spatial enhancement apparatus
and methods, which try to reduce the unwanted effects of the spatial
processing on the monophonic component by generating a pseudo-stereo
signal prior to the actual spatial broadening. The
aforementioned document refers to the so-called sum-difference
processing which does not insert any binaural cues, and which is
therefore not relevant to headphone listening applications.

WO publication 97/00594 discloses a method and apparatus for spatially
enhancing stereo and monophonic components. This solution, which is
based on the use of analog electronic circuits, also utilizes the idea of a
pseudo-stereo signal synthesized from the monophonic signal in order
to further spatially enhance the monophonic component. Such an
approach, however, leads to unavoidable degradation of the quality of
the original recording.

The main purpose of the present invention is to introduce a novel and
simple solution for spatial processing of stereo format signals to
become suitable to be played back using headphones in a manner
ensuring that also the monophonic component of said stereo signals
can be perceived substantially free of disturbing artifacts. In a broad
sense, the invention is applicable to such situations where the stereo
format audio material is to be listened to using headphones, i.e. the
audio material is provided as separate left and right channel signals. The
audio material may have been provided directly as a two-channel
stereo recording, or it may have been converted to such a two-channel
format from some other format known as such.

The current invention specifies a signal processing approach,
preferably based on digital signal processing, for equalizing the output
from a spatial enhancer system in such a way that the amplitude
spectrum of the monophonic component of the output signals can be
maintained flatter than in some prior art methods. This ensures that the
spatial impression of the spatially enhanced signals in a headphone
listening situation can be perceived as substantially free of artifacts.

This desired effect is produced by adding energy to the signals output
from the spatial enhancer, in a slightly delayed manner relative to the
direct sound, and within that frequency band where the monophonic
signal component needs boosting in order to compensate for the
attenuation caused by the above explained destructive interference.
According to a preferred embodiment of the invention the gain that
determines the level of the added energy can be varied in real-time
according to the strength of the monophonic component of the original
stereo signals.

To attain these purposes, the method according to the invention is
primarily characterized in what will be presented in the characterizing
part of the independent claim 1. The signal processing device
according to the invention is primarily characterized in what will be
presented in the characterizing part of the independent claim 9. The
computer program according to the invention is primarily characterized
in what will be presented in the characterizing part of the independent
claim 19. The mobile appliance with audio capabilities according to the
invention is primarily characterized in what will be presented in the
characterizing part of the independent claim 21.

The other dependent claims present some preferred embodiments of
the invention.

According to one interpretation the invention can be considered as a
kind of add-on module, or as a "third" channel separate from the spatial
enhancer or stereo widening network itself. This module or channel
equalizes the output from the spatial enhancer in a certain way in order
to eliminate or minimize the artifacts otherwise caused by the variation
of the amplitude spectrum of the monophonic component. Therefore,
listeners will not perceive a significant decrease in sound quality when
the invention is applied to spatial processing otherwise used to
enhance high quality music recordings for headphone listening.

The problem related to the behaviour of the monophonic component in
spatial enhancement for headphone listening has not received very
much attention previously. In fact most spatial enhancers according to
the related art attempt to achieve a quite dramatic, and therefore rather
unnatural effect, and it is usually claimed that listeners prefer this.
However, it is the understanding of the Applicant that in the case of
high-quality music recordings this is not unconditionally true. Even
though preferences vary between individual listeners, there is
evidence to suggest that many listeners prefer a clean, and
therefore natural sound to a heavily processed and spatially "overrich"
sound.

The current invention is the first to apply a design constraint which is
related to the sound quality in an objective way. The method and
devices according to the invention are more advantageous than prior
art methods and devices in avoiding/minimizing unwanted and
unpleasant colouration of the reproduced sound, especially in the case
of high-quality and high-fidelity audio material.

The method according to the invention is especially suitable to be
applied together with the stereo widening network developed by the
Applicant and described in the aforementioned patent application EP
1194007.

However, it should be understood that the invention can be applied
together with a wide variety of stereo widening or corresponding spatial
signal processing methods, where at least one delay-introducing
cross-talk signal path is formed between the left and right channel
direct signal paths, and thus the aforementioned destructive interference
effects may affect the quality of the sound.

The method according to the invention may be implemented using either
hardware or software based systems. A considerable advantage of the
present invention is that it does not degrade the excellent sound quality
available today from digital sound sources such as, for example,
CompactDisk players, MiniDisk players, MP3- and AAC-players and
digital broadcasting techniques. The processing scheme according to
the invention is also sufficiently simple to run in real-time on a portable
device, because it can be implemented at modest computational
expense.

During the last decade the aforementioned digital portable and
personal audio appliances have become increasingly popular. This
development has, among other things, strongly increased the use of
headphones in the listening of music recordings, radio broadcasts etc.
However, the commercially available music recordings and other audio
material are still almost exclusively in the two-channel stereo format,
and thus intended for playback over loudspeakers and not over
headphones. The current invention provides a solution for converting
such audio material for headphone listening without degradation of the
original high sound quality. The invention can be implemented in a
wide variety of different types of portable audio appliances including
also different types of wireless communication devices.

The preferred embodiments of the invention and their benefits will
become more apparent to a person skilled in the art through the
description hereinbelow, and also through the appended claims.

In the following, the invention will be described in more detail with
reference to the appended drawings, in which

Fig. 1 illustrates schematically a basic prior art type stereo
widening network relying on the virtual loudspeaker
approach,

Fig. 2 illustrates schematically the basic idea behind the present
invention,

Fig. 3 illustrates schematically a stereo widening network together
with a monophonic equalizer module according to the
invention,

Fig. 4 exemplifies the magnitude response of the monophonic
component of a stereo widening network without
equalization,

Fig. 5 exemplifies the magnitude response of the monophonic
component of a stereo widening network equalized
according to the invention,

Fig. 6 exemplifies the impulse response of a monophonic
equalizer module realized using a second order IIR filter,
and

Fig. 7 exemplifies the magnitude response of a monophonic
equalizer module realized using a second order IIR filter.

Figure 1 shows a basic prior art type stereo widening network SW
relying on the virtual loudspeaker approach. As discussed already
above, the direct paths are denoted by subscript 'd' and the cross-talk
paths by subscript 'x'. The direct path and the cross-talk path each has
a discrete-time transfer function, H_d(z) and H_x(z) respectively. The
cross-talk path transfer functions H_x(z) include a delay term in order to
create a proper spatial hearing impression. The aforementioned patent
application EP 1194007 by the Applicant discusses the operation of
such a stereo widening network, and especially its preferred balanced
embodiment, in more detail.

Figure 2 shows schematically a situation where the stereo signals L, R
are fed to a pair of loudspeakers positioned at straight left and straight
right relative to the listener. When the loudspeakers are positioned
symmetrically with respect to the listener the direct path from the left
speaker to the left ear is the same as the direct path from the right
speaker to the right ear, and, similarly, the cross-talk from the left
speaker to the right ear is the same as the cross-talk from the right
speaker to the left ear. Therefore, the left and right direct path transfer
functions H_d(z) can be taken as identical, as can the left and right
cross-talk path transfer functions H_x(z).

It is readily seen that when the input signals L, R to the two virtual
loudspeakers are identical, i.e. monophonic, no sound is reproduced at
the listener's ears when H_d is equal in amplitude, but opposite in phase,
to H_x. In that case the sound propagating along the direct path is
cancelled out completely by the sound from the cross-talk path due to
the earlier discussed destructive interference effects.

In a practical implementation of H_d and H_x, when designed for
maximum stereo widening where virtual loudspeakers span
substantially 180°, the aforementioned attenuation of the monophonic
component occurs at frequencies centered around approximately 600
Hz. When virtual loudspeakers span 60° the attenuation occurs just
below 2 kHz. The frequencies where the attenuation of the monophonic
component takes place depend on the amount of the time delay
between the direct and cross-talk paths (interaural time difference, ITD),
which delay obviously depends on the location and span of the virtual
loudspeakers. In principle, severe attenuation of the monophonic
component may take place anywhere between 500 Hz and 2 kHz
depending on the location and span of the loudspeakers, and the size
of the head being modelled.

Therefore, according to the invention the equalising of the output of the
stereo widening network should take place so that the amplitude
spectrum of the monophonic component of the output signals can be
maintained substantially flat in the aforementioned frequencies. The
most obvious use of the monophonic equalizer is to compensate for a
dip in the magnitude response at 600 Hz, but for the aforementioned
reasons it can typically be useful for compensating for a dip in the
magnitude response anywhere between 500 Hz and 2 kHz.
Furthermore, it is understandable to a skilled person that the frequency
range to be used can in special circumstances be significantly different
from the above, for example from 400 Hz to 2.5 kHz. Further,
depending on the filtering applied, the monophonic signal may also be
amplified somewhat outside the band. Still further, the filtering may
cause the amplification of the component to be unequal inside the
band, e.g., the band may essentially be split in parts.

In order to understand the invention better in a conceptual manner, one
can consider a third virtual loudspeaker M positioned at straight front
with respect to the listener (see Fig. 2). Sound emitted from this third
loudspeaker M reproduces identical sound pressures at the two ears of
the listener. The basic idea of the invention conceptually is to use said
speaker M to fill in the missing, attenuated energy in the monophonic
component. Thus, the input to this virtual loudspeaker M is ideally a
bandpassed version of the monophonic component of signals L and R,
optionally modulated by a time-varying gain g_m whose value depends
on how similar stereo signals L and R are. The gain g_m should be large
when signals L and R are almost identical, i.e. highly monophonic (low
stereophony), and the gain g_m should be small when said signals L, R
are very different (high stereophony).

There are various ways to extract an estimate of the amount of the
monophonic component, or correspondingly to estimate the amount of
stereophony of the signals L, R. One method for estimating the
stereophony is presented, for example, in patent publication EP
955789. A simple approach is to use the momentary average (L+R)/2
of the left and right channel signals. The benefit of this approach is that
the signal (L+R)/2 can be determined substantially instantaneously. A
more sophisticated method could be the use of a coherence function
between signals L, R. This may be understood broadly as the use of the
history of the two channels in order to obtain an improved estimate of
the component common to them, i.e. the similarity or correlation
between the channels. This may be achieved, for example, by
comparing the spectral values of the channels. For example, if a block
of 20 ms of samples of the signals is available, it is possible to calculate
the spectrum of both channels, compare them with each other, and
keep as the monophonic component only those frequency bands that
contain roughly the same amount of energy. Multi-channel formats,
which are likely to gain widespread use in the future, might provide
other ways to extract the monophonic component, and other ways to
mix in the monophonic component with the channels that are spatially
processed. The 5.1 format, for example, includes a separate center
channel.
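
One possible way to derive a time-varying gain g_m from a block of
samples is sketched below. It is not one of the methods named in the
text. It is a simple assumed heuristic that compares the energy of the
average signal (L+R)/2 with the energy of the difference signal (L-R)/2
over the block. The function name and the upper limit g_max = 0.56 are
hypothetical illustration values.

```python
def estimate_mono_gain(left, right, g_max=0.56):
    """Map channel similarity over one block to a gain between 0 and g_max.

    Heuristic (an assumption, not the patent's method): the share of the
    block energy carried by the common component (L+R)/2.
    """
    mid_energy = sum(((l + r) / 2.0) ** 2 for l, r in zip(left, right))
    side_energy = sum(((l - r) / 2.0) ** 2 for l, r in zip(left, right))
    total = mid_energy + side_energy
    if total == 0.0:
        return 0.0  # silent block: nothing to equalize
    return g_max * mid_energy / total
```

Identical channels give the largest gain, while channels that are equal
in magnitude but opposite in sign give a gain of zero, which matches the
qualitative behaviour required of g_m.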

The center frequency and the bandwidth of the bandpass filter H_m(z)
responsible for providing the signal to the third virtual loudspeaker M
must be matched to compensate for the attenuation of the monophonic
component in the stereo widening network SW. Preferably the third
virtual loudspeaker M is positioned slightly further away from the
listener than the left and right virtual loudspeakers L, R in order to
prevent the narrowing of the soundstage caused by the added central
sound source. In terms of signal processing this corresponds to adding
a certain delay to the signal corresponding to the third virtual
loudspeaker M. The additional delay incorporated in the transfer
function H_m(z) in order to do this should be of the order of 1 ms, but its
exact value is not critical, and it can also be negative, like -1 ms, or for
example from -5 ms to 50 ms. It should be noted that in Fig. 2 a
common delay is removed, so that the transfer function H_d(z), which
represents the direct path, starts responding at time n=0.

Figure 3 shows schematically a block diagram of the monophonic
equalizer ME attached as a "third" channel to a stereo widening
network SW. Figure 3 also shows an optional preprocessing block PP
in front of the stereo widening network SW for decorrelation of the
stereo signals L, R before they enter the actual stereo widening network
SW. The role of the preprocessing block PP is discussed in more
detail later in this text.

In this example the monophonic component of the stereo signals L, R is
estimated by the average signal (L+R)/2. The monophonic equalizer,
implemented by the gain g_m, which is optionally time-varying, and the
digital filter z^-N H_m(z), is contained in the "third" channel ME at the top.

z^-N is a pure delay of N samples, and H_m(z) is typically a bandpass filter
with a gentle cut-on and cut-off slope. Such a filter can be implemented
very efficiently by, for example, a second order Infinite Impulse
Response (IIR) filter section whose z-transform is given by

$$H_m(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{1 + a_1 z^{-1} + a_2 z^{-2}} \qquad (1)$$

An example of a suitable set of parameter values at a sample rate of
44.1 kHz is the following:

b_0 = 0.0277,
b_1 = 0,
b_2 = -0.0277,
a_1 = -1.533825995619348,
a_2 = 0.94457402736173.

The maximum gain of this IIR filter is 0 dB. Accurate equalization of the
monophonic component requires that the overall gain g_m is close to 1,
but in practice a value slightly above 0.5, which corresponds to
approximately -5 dB, is found to work better. If g_m is increased further,
the spatial effect may suffer without any noticeable improvement in the
sound quality. The gain g_m may be time-varying or given a constant
value.
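
The "third" channel can be sketched in floating point as follows. The
coefficients are copied from the example parameter set as printed in the
text, and the denominator is assumed to have the conventional form
1 + a_1 z^-1 + a_2 z^-2. The function names, the direct form I
structure, the constant gain of 0.56 and the default delay of 55 samples
(the value used in the figure examples) are illustrative choices rather
than a definitive implementation.

```python
import cmath
import math

# Coefficients as printed in the example parameter set (44.1 kHz).
B = (0.0277, 0.0, -0.0277)
A = (1.0, -1.533825995619348, 0.94457402736173)


def biquad(x, b=B, a=A):
    """Second order IIR section in direct form I."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def equalizer_channel(left, right, g_m=0.56, delay_samples=55):
    """g_m * z^-N * H_m(z) applied to the average signal (L+R)/2."""
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    delayed = [0.0] * delay_samples + biquad(mono)
    return [g_m * v for v in delayed[:len(mono)]]


def magnitude(freq_hz, fs=44100.0, b=B, a=A):
    """|H_m| evaluated on the unit circle at freq_hz."""
    z1 = cmath.exp(-2j * math.pi * freq_hz / fs)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(num / den)
```

The signal returned by `equalizer_channel` would be added to both output
channels of the stereo widening network. For a section of this form with
b_2 = -b_0 and b_1 = 0, the peak gain equals 2 b_0 / (1 - a_2), which is
very close to 1 (0 dB) for the values above.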

Figures 4 and 5 show examples of the magnitude response of a stereo
widening network without and with the monophonic equalization
according to the invention, respectively. The sampling frequency in
these examples is taken to be 44.1 kHz, and the equalizer transfer
function H_m(z) is a second order IIR filter whose output is delayed 55
samples relative to H_d.

Figures 6 and 7 show examples of the impulse response and
magnitude response of H_m(z), which is deliberately designed not to
achieve very accurate equalization.

It is clear to a person skilled in the art that in floating-point precision it
is rather straightforward to implement the second order IIR filter H_m(z)
given above. However, implementation of IIR filters in fixed-point
precision is notoriously difficult, and for this reason we give here an
example of how to run the monophonic equalizer according to the
invention using only a very basic instruction set, i.e. software program
code on a fixed-point platform such as a Digital Signal Processor
(DSP).

It is possible to run the monophonic equalizer without explicit
multiplications. However, in order to process 16-bit audio it is
necessary to use 32-bit variables internally. The implementation is
based on a state variable description whose 2-by-2 feedback matrix
contains the real and imaginary parts of the two conjugate poles, which
are the roots of the denominator of the transfer function. The real parts
are on the diagonal whereas the imaginary parts are off the diagonal,
with a positive sign on the element in the lower left corner and a
negative sign on the element in the upper right corner. It is much more
accurate to approximate the positions of the poles in this way than it is
to use the difference equation with coefficients that are approximations
to the exact polynomial. This approach makes it possible to choose the
pole positions as well as the other values of the parameters in the state
variable description so that all multiplications can be calculated by
bitshifts and additions. The update equations for the filter H_m(z) are
defined by

[Equation (2) is not legible in the source]

and

[Equation (3) is not legible in the source]

where x_1 and x_2 are state variables, u is the input, and y is the output.

15 An attenuation is buitt into said filter H m (z) so that its maximum gain Is 
around -5 dB. Consequently, if u is 16-bit audio signal, then y can also 
be stored In a 16-bit variable. The state variables x, and X2, however, 
must be 32 bit The parameters listed in liquations 2 and 3 are 
carefully chosen to ensure sufficient dynamic range without any risk of 

20 overflow, there are three or four bits headroom left even when the 
Input is highly compressed pop music, and the slgnal-to-noise ratio Is 
excellent. 
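
As an illustration of the shift-and-add structure described above, the following Python sketch runs a second order state variable filter using integer shifts, additions and subtractions only. The pole position (a = 1 - 2^-3, b = 2^-2), the scaling constants IN_SHIFT and OUT_SHIFT, and all function names are hypothetical values chosen for this example so that every product reduces to a bitshift; they are not the parameters of Equations 2 and 3. Python integers are unbounded, so the sketch records the largest state magnitude in order to check that the states would fit in 32-bit variables.

```python
import math

# Hypothetical parameters, chosen only so that every product is a power-of-two shift.
# They are NOT the values of Equations (2) and (3).
IN_SHIFT = 8     # input is scaled up by 2**8 before entering the states
OUT_SHIFT = 11   # output is scaled down by 2**11 (built-in attenuation)


def equalizer_step(state, u):
    """Advance the filter by one sample using shifts, additions and subtractions.

    Feedback matrix [[a, -b], [b, a]] with a = 1 - 2**-3 and b = 2**-2,
    i.e. conjugate poles at a +/- jb (magnitude about 0.91, so the filter is stable).
    """
    x1, x2 = state
    y = x2 >> OUT_SHIFT  # output is read from the current state
    new_x1 = x1 - (x1 >> 3) - (x2 >> 2) + (u << IN_SHIFT)
    new_x2 = (x1 >> 2) + x2 - (x2 >> 3)
    state[0], state[1] = new_x1, new_x2
    return y


def equalizer_step_float(state, u):
    """Floating-point reference with the same coefficients, for comparison."""
    a, b = 1.0 - 2.0 ** -3, 2.0 ** -2
    x1, x2 = state
    y = x2 / 2 ** OUT_SHIFT
    state[0] = a * x1 - b * x2 + u * 2 ** IN_SHIFT
    state[1] = b * x1 + a * x2
    return y


def run(step, samples):
    """Run a step function over the samples and track the largest state magnitude."""
    state = [0, 0]
    out, peak = [], 0
    for u in samples:
        out.append(step(state, u))
        peak = max(peak, abs(state[0]), abs(state[1]))
    return out, peak


# Full-scale 16-bit sine at the angle of the illustrative pole pair.
theta = math.atan2(2.0 ** -2, 1.0 - 2.0 ** -3)
test_signal = [int(32767 * math.sin(theta * n)) for n in range(2000)]
y_fixed, peak_state = run(equalizer_step, test_signal)
y_float, _ = run(equalizer_step_float, test_signal)
max_error = max(abs(p - q) for p, q in zip(y_fixed, y_float))
```

The comparison against the floating-point reference (max_error) and the recorded peak state (peak_state) can be used to judge whether a candidate set of shifts leaves enough headroom and precision before committing to it on a fixed-point platform.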






However, it should be noted that optimising the algorithm is a manual
procedure, and it is necessary to go through it again if, for example, the
filter H_m(z) has to be designed for another sampling frequency.
Therefore the aforementioned should be understood as an example
which does not limit the possible embodiments of the invention.

When the input is purely monophonic, which means that signals L, R
are the same, decorrelation can be used to produce a pseudo-stereo
signal which is further passed to the stereo widening network. Figure 3
illustrates the use of an optional pre-processing block PP for
decorrelation of the signals L, R prior to the stereo widening network
SW. This type of pseudo-stereo processing is often referred to as
mono-to-3D. The monophonic equalizer ME according to the invention
also works well in this application since it strengthens the centre sound
image at the frequencies where vocals and lead instruments have a
significant part of their energy. The invention improves the overall
sound quality at the expense of a slight narrowing of the sound stage,
just as it does for two-channel stereo without decorrelation. Thus, the
monophonic equalizer ME according to the invention can be used in a
'mild widening' preset for both mono and stereo inputs.
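
The text does not specify how the pre-processing block PP decorrelates the signals. As one simple possibility, assumed here purely for illustration and not taken from the invention, two differing channels can be produced by adding a delayed copy of the mono signal to one channel and subtracting it from the other. The function name, the delay of 8 samples and the gain of 0.5 are hypothetical.

```python
import math


def pseudo_stereo(mono, delay=8, gain=0.5):
    """Create two differing channels by adding/subtracting a delayed copy of the input."""
    left, right = [], []
    for n, m in enumerate(mono):
        delayed = mono[n - delay] if n >= delay else 0.0
        left.append(m + gain * delayed)
        right.append(m - gain * delayed)
    return left, right


mono = [math.sin(0.1 * n) for n in range(200)]
left, right = pseudo_stereo(mono)
# The momentary average (L+R)/2 of this pseudo-stereo pair returns the original signal.
recovered = [(l + r) / 2.0 for l, r in zip(left, right)]
```

With this particular scheme the average (L+R)/2 of the two channels equals the original mono signal, so a monophonic path based on that average would still see the full centre image. Other decorrelation methods would not necessarily have this property.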

The monophonic equalizer ME according to the invention can be used
in connection with a large variety of different kinds of spatial enhancers
or stereo widening networks. Preferably, the invention is used in
connection with the balanced stereo widening network disclosed in the
earlier patent application EP 1194007 by the Applicant. In addition to
the monophonic equalizer ME disclosed here, said balanced stereo
widening network can further be used together with different types of
pre- and/or post-processing methods known as such.

It is therefore obvious to a person skilled in the art that the present
invention is not restricted solely to the embodiments presented above,
but it can be freely modified within the scope of the appended claims.

It is possible to implement the method according to the invention also
by using analog electronics, but it is obvious to anyone skilled in the
art that the preferred embodiments are based on digital signal
processing techniques. The digital signal processing structures may
also be other than IIR structures, for example, Finite Impulse Response
(FIR) structures.

In the previous examples the monophonic signal component is first
extracted from the left and right input signals, and the bandpass
filtering and also other processing steps directed to said signal
component are performed after that. However, it is also possible to
construct the monophonic signal path ME in such a way that the
bandpass filtering is performed before the other processing steps. In
some applications this can be advantageous. For example, if the
bandpass filtering is performed first, it is possible to downsample both
the left and right channels before applying a possibly very sophisticated
algorithm for the extraction of the monophonic component. Therefore,
the processing steps contained in the monophonic signal path ME may
be performed in any appropriate order with respect to each other.
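
The two orderings discussed above can be compared with a short Python sketch. It assumes the simplest extraction, the momentary average (L+R)/2, and uses a generic second order bandpass centred at 1 kHz as a stand-in for H_m(z). The filter design, the 48 kHz sampling rate, the test signals and the function names are all assumptions made for this example.

```python
import math


def bandpass(samples, fs=48000.0, f0=1000.0, q=0.667):
    """Generic second order bandpass (a stand-in for H_m(z), not the patent's filter)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0 = alpha / (1.0 + alpha)
    b2 = -alpha / (1.0 + alpha)
    a1 = -2.0 * math.cos(w0) / (1.0 + alpha)
    a2 = (1.0 - alpha) / (1.0 + alpha)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        # Direct form difference equation; the x[n-1] coefficient is zero.
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


def mono_average(left, right):
    """Momentary average (L+R)/2 of the two channels."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


# Hypothetical input: a shared 1 kHz component plus different content per channel.
fs = 48000.0
n = range(4800)
centre = [math.sin(2.0 * math.pi * 1000.0 * k / fs) for k in n]
left = [c + 0.3 * math.sin(2.0 * math.pi * 300.0 * k / fs) for c, k in zip(centre, n)]
right = [c + 0.3 * math.sin(2.0 * math.pi * 5000.0 * k / fs) for c, k in zip(centre, n)]

# Ordering 1: extract the monophonic component first, then filter it.
extract_then_filter = bandpass(mono_average(left, right))
# Ordering 2: filter each channel first, then extract the monophonic component.
filter_then_extract = mono_average(bandpass(left), bandpass(right))

max_difference = max(abs(p - q) for p, q in zip(extract_then_filter, filter_then_extract))
```

Because both the averaging and the filter are linear, the two orderings agree up to rounding error in this sketch. A more sophisticated, non-linear extraction algorithm would in general give different results depending on the order, which is why the choice is left to the application.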



The disclosed invention is especially intended for converting audio
material having signals in the general two-channel stereo format for
headphone listening. This includes all audio material, for example
speech, music or effect sounds, which are recorded and/or mixed
and/or otherwise processed to create two separate audio channels,
which said channels can also further contain monophonic components,
or which channels may have been created from a monophonic single
channel source, for example, by decorrelation methods and/or by
adding reverberation. This also allows the use of the method according
to the invention for improving the spatial impression when listening
to different types of monophonic audio material.

The media providing the stereo signals for processing can include, for
example, CompactDisc, MiniDisc, AAC or any other digital media
including public TV, radio or other broadcasting, computers and also
telecommunication devices, such as mobile or multimedia phones,
PDAs, web pads etc. Stereo signals may also be provided as analog
signals, which, prior to the processing in a digital network, are first
AD-converted.

The signal processing device according to the invention can be
incorporated into different types of portable, mobile appliances, such as
portable players or communication devices, but also into non-portable
devices, such as home stereo systems or PC-computers. The
implementation of the monophonic equalizer may be hardware or
software based, or the practical implementation may be a suitable
mixture of those depending on the specific application.

Claims:

1. A method in stereo widening (SW) or corresponding spatial signal
processing of stereo format signals to become suitable for headphone
listening, which method comprises at least the steps of

- forming left and right channel signal paths (L_d, R_d) in order to
  process the left and right channel input signals (L_in, R_in) into left
  and right channel output signals (L_out, R_out), and
- forming at least one delay introducing cross-talk signal path (L_x,
  R_x) between the left and right channel signal paths (L_d, R_d),

characterized in that the method further comprises the step of forming
a separate monophonic signal path (ME) in order to equalize the
frequency spectrum of the monophonic component of the left and right
output signals (L_out, R_out) by at least

- extracting from the left and right input signals (L_in, R_in) an at least
  substantially monophonic signal component contained in said
  signals (L_in, R_in),
- processing the monophonic signal component to obtain a
  processed monophonic signal component, and
- combining said processed monophonic signal component with at
  least one of the left (L_out) and the right (R_out) output signals.

2. The method according to claim 1, characterized in that the at least
substantially monophonic signal component is extracted from the left
and right input signals (L_in, R_in) based on the momentary average value
(L+R)/2 of said signals.

3. The method according to claim 1, characterized in that the at least
substantially monophonic signal component is extracted from the left
and right input signals (L_in, R_in) based on the similarity between said
signals.

4. The method according to claim 1, characterized in that the
processing of the monophonic signal component includes processing
of the frequency spectrum of said signal component.

5. The method according to claim 4, characterized in that the
processing of the frequency spectrum of said signal component is
performed substantially within a frequency range ranging from 500 Hz
to 2 kHz.

6. The method according to claim 1, characterized in that the
processing of the monophonic signal component includes adjustment
of the gain of said signal component.

7. The method according to claim 6, characterized in that the
adjustment of the gain is performed in a time varying manner.

8. The method according to claim 1, characterized in that the
processing of the monophonic signal component includes adding a
delay to said signal.

9. A signal processing device for stereo widening (SW) or
corresponding spatial signal processing of stereo format signals to
become suitable for headphone listening, the device comprising at
least

- left and right channel signal paths (L_d, R_d) in order to process the
  left and right channel input signals (L_in, R_in) into left and right
  channel output signals (L_out, R_out), and
- at least one delay introducing cross-talk signal path (L_x, R_x)
  between the left and right channel signal paths (L_d, R_d),

characterized in that the device further comprises a separate
monophonic signal path (ME) in order to equalize the frequency
spectrum of the monophonic component of the left and right output
signals (L_out, R_out), said monophonic signal path (ME) comprising at
least

- means for extracting from the left and right input signals (L_in, R_in) an
  at least substantially monophonic signal component contained in
  said signals (L_in, R_in),
- means for processing the monophonic signal component to obtain
  a processed monophonic signal component, and
- means for combining said processed monophonic signal
  component with at least one of the left (L_out) or the right (R_out)
  output signals.

10. The device according to claim 9, characterized in that the means
for extracting the at least substantially monophonic signal component
from the left and right input signals (L_in, R_in) are based on determining
the momentary average value (L+R)/2 of said signals.

11. The device according to claim 9, characterized in that the means
for extracting the at least substantially monophonic signal component
from the left and right input signals (L_in, R_in) are based on the similarity
between said signals.

12. The device according to claim 9, characterized in that the means
for processing the monophonic signal component include means for
processing of the frequency spectrum of said signal component.

13. The device according to claim 12, characterized in that the means
for processing the frequency spectrum of said signal component
comprise a digital Infinite Impulse Response (IIR) or a Finite Impulse
Response (FIR) filter structure.

14. The device according to claim 12 or 13, characterized in that the
processing of the frequency spectrum of said signal component is
performed substantially within a frequency range ranging from 500 Hz
to 2 kHz.

15. The device according to claim 9, characterized in that the means
for processing the monophonic signal component include means for
adjusting the gain of said signal component.

16. The device according to claim 15, characterized in that the means
for adjusting the gain are arranged to perform the adjustment in a time
varying manner.

17. The device according to claim 9, characterized in that the means
for processing the monophonic signal component include means for
adding a delay to said signal.

18. The device according to claim 9, characterized in that the device is
a digital signal processing device.

19. A computer program comprising machine executable steps,
characterized in that it is arranged to carry out the method steps
according to any of the aforementioned claims 1-8.

20. A computer program according to claim 19, characterized in that it
is arranged to be executed in a digital signal processor.

21. A mobile appliance with audio capabilities, characterized in that it
comprises a signal processing device according to any of the
aforementioned claims 9-17.

22. A mobile appliance according to claim 21, characterized in that it
is a portable digital player or a digital mobile telecommunication device.

Abstract:

The invention relates to a method, signal processing
device and computer program for stereo widening (SW)
of stereo format signals to become suitable for
headphone listening. The invention also relates to a
mobile appliance performing signal processing
according to the invention. According to the invention a
separate monophonic signal path (ME) is formed in
order to equalize the frequency spectrum of the
monophonic component of the left and right output
signals (L_out, R_out) by at least extracting from the left and
right input signals (L_in, R_in) an at least substantially
monophonic signal component contained in said signals
(L_in, R_in), processing the extracted monophonic signal
component to obtain a processed monophonic signal
component, and combining said processed monophonic
signal component with at least one of the left (L_out) or the
right (R_out) output signals.

Fig. 3